Overview

Dataset statistics

Statistic | Original Dataset | Oversampled Dataset
Number of variables | 15 | 15
Number of observations | 188 | 862
Missing cells | 0 | 0
Missing cells (%) | 0.0% | 0.0%
Duplicate rows | 0 | 175
Duplicate rows (%) | 0.0% | 20.3%
Total size in memory | 22.2 KiB | 107.8 KiB
Average record size in memory | 120.7 B | 128.0 B
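
The duplicate-row percentage is the number of duplicate rows divided by the number of observations. A minimal sketch that recomputes it from the counts reported above (the function name is illustrative and not part of ydata-profiling):

```python
def duplicate_share(n_duplicates: int, n_rows: int) -> float:
    """Return duplicate rows as a percentage of all rows."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    return 100.0 * n_duplicates / n_rows

# Counts taken from the Dataset statistics table.
original = duplicate_share(0, 188)
oversampled = duplicate_share(175, 862)
print(f"{original:.1f}% {oversampled:.1f}%")
```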

Variable types

Variable type | Original Dataset | Oversampled Dataset
Categorical | 8 | 8
Numeric | 7 | 7

Alerts

Original Dataset | Oversampled Dataset | Alert type
nozzle_voltage is highly overall correlated with velocity and 1 other fields | Alert not present | High Correlation
drop_spacing is highly overall correlated with line_width and 1 other fields | Alert not present | High Correlation
time is highly overall correlated with distance | Alert not present | High Correlation
velocity is highly overall correlated with nozzle_voltage | Alert not present | High Correlation
line_width is highly overall correlated with drop_spacing and 1 other fields | Alert not present | High Correlation
roughness is highly overall correlated with drop_spacing and 1 other fields | Alert not present | High Correlation
print_height is highly overall correlated with ink_visco_cp and 5 other fields | Alert not present | High Correlation
distance is highly overall correlated with nozzle_voltage and 1 other fields | Alert not present | High Correlation
ink_visco_cp is highly overall correlated with print_height and 5 other fields | Alert not present | High Correlation
ink_visco_pas is highly overall correlated with print_height and 5 other fields | Alert not present | High Correlation
surface_tension_dyne_cm is highly overall correlated with print_height and 5 other fields | Alert not present | High Correlation
surface_tension_n_m is highly overall correlated with print_height and 5 other fields | Alert not present | High Correlation
ink _density is highly overall correlated with print_height and 5 other fields | Alert not present | High Correlation
z_number is highly overall correlated with print_height and 5 other fields | Alert not present | High Correlation
overspray has 8 (4.3%) zeros | overspray has 21 (2.4%) zeros | Zeros
Alert not present | Dataset has 175 (20.3%) duplicate rows | Duplicates
Alert not present | print_height has a high cardinality: 91 distinct values | High Cardinality
Alert not present | distance has a high cardinality: 93 distinct values | High Cardinality
Alert not present | ink_visco_cp has a high cardinality: 220 distinct values | High Cardinality
Alert not present | ink_visco_pas has a high cardinality: 220 distinct values | High Cardinality
Alert not present | surface_tension_dyne_cm has a high cardinality: 220 distinct values | High Cardinality
Alert not present | surface_tension_n_m has a high cardinality: 220 distinct values | High Cardinality
Alert not present | ink _density has a high cardinality: 51 distinct values | High Cardinality
Alert not present | z_number has a high cardinality: 220 distinct values | High Cardinality
Alert not present | distance is highly imbalanced (55.0%) | Imbalance
Alert not present | ink_visco_cp is highly imbalanced (56.4%) | Imbalance
Alert not present | ink_visco_pas is highly imbalanced (56.4%) | Imbalance
Alert not present | surface_tension_dyne_cm is highly imbalanced (56.4%) | Imbalance
Alert not present | surface_tension_n_m is highly imbalanced (56.4%) | Imbalance
Alert not present | ink _density is highly imbalanced (57.4%) | Imbalance
Alert not present | z_number is highly imbalanced (56.4%) | Imbalance
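
The report does not state how the imbalance percentage is computed. The sketch below assumes it is one minus the Shannon entropy of the value counts, normalized by log2 of the number of distinct values. Under that assumption, the ink_visco_cp counts listed in this report for the Oversampled Dataset (6.9 with 478 rows, 6.3 with 166 rows, and 218 values that occur once) reproduce the 56.4% in the alert. The function name is illustrative.

```python
import math

def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform variable, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# ink_visco_cp, Oversampled Dataset: two dominant values plus 218 singletons
# (220 distinct values, 862 rows in total).
counts = [478, 166] + [1] * 218
score = imbalance_score(counts)
print(f"{score:.1%}")  # 56.4%
```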

Reproduction

Field | Original Dataset | Oversampled Dataset
Analysis started | 2023-05-08 07:57:54.416257 | 2023-05-08 07:57:59.507828
Analysis finished | 2023-05-08 07:57:59.493524 | 2023-05-08 07:58:06.736570
Duration | 5.08 seconds | 7.23 seconds
Software version | ydata-profiling v4.1.2 | ydata-profiling v4.1.2
Download configuration | config.json | config.json
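
The reported durations follow from the start and finish timestamps. A small sketch using only the standard library (the helper name is illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

def duration_seconds(started: str, finished: str) -> float:
    """Elapsed time between two report timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()

original = duration_seconds("2023-05-08 07:57:54.416257", "2023-05-08 07:57:59.493524")
oversampled = duration_seconds("2023-05-08 07:57:59.507828", "2023-05-08 07:58:06.736570")
print(f"{original:.2f} s, {oversampled:.2f} s")
```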

Variables

print_height
Categorical

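Statistic | Original Dataset | Oversampled Dataset
Distinct | 4 | 91
Distinct (%) | 2.1% | 10.6%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values, Original Dataset: 800 (55), 750 (53), 700 (51), 650 (29)
Most frequent values, Oversampled Dataset: 800 (206), 700 (167), 750 (165), 650 (106), 698 (8), other values (86): 210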

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 3 | 3
Median length | 3 | 3
Mean length | 3 | 3
Min length | 3 | 3

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 564 | 2586
Distinct characters | 5 | 10
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
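
As an illustration of how such character properties can be tallied, the sketch below rebuilds the character and general-category counts for the Original Dataset from the print_height value counts (800: 55, 750: 53, 700: 51, 650: 29). It uses Python's standard `unicodedata` module, where the category code `Nd` corresponds to "Decimal Number"; it is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Original Dataset value counts for print_height, as listed in this report.
value_counts = {"800": 55, "750": 53, "700": 51, "650": 29}

char_counts = Counter()
category_counts = Counter()
for value, n in value_counts.items():
    for ch in value:
        char_counts[ch] += n
        category_counts[unicodedata.category(ch)] += n

print(char_counts.most_common())   # characters ordered by frequency
print(dict(category_counts))       # general categories, e.g. 'Nd'
```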

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 38
Unique (%) | 0.0% | 4.4%
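
The sketch below shows one way to read the two counts, assuming "Distinct" means the number of different values and "Unique" means the number of values that occur exactly once. The Original Dataset column is rebuilt from the print_height value counts in this report; the helper name is illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values, and values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Original Dataset print_height, rebuilt from its value counts.
original = ["800"] * 55 + ["750"] * 53 + ["700"] * 51 + ["650"] * 29
distinct, unique = distinct_and_unique(original)
n = len(original)
print(distinct, f"{100 * distinct / n:.1f}%", unique, f"{100 * unique / n:.1f}%")
```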

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 800 | 702
2nd row | 800 | 702
3rd row | 800 | 696
4th row | 800 | 703
5th row | 800 | 698

Common Values

Original Dataset
Value | Count | Frequency (%)
800 | 55 | 29.3%
750 | 53 | 28.2%
700 | 51 | 27.1%
650 | 29 | 15.4%

Oversampled Dataset
Value | Count | Frequency (%)
800 | 206 | 23.9%
700 | 167 | 19.4%
750 | 165 | 19.1%
650 | 106 | 12.3%
698 | 8 | 0.9%
752 | 8 | 0.9%
751 | 7 | 0.8%
801 | 6 | 0.7%
749 | 6 | 0.7%
702 | 5 | 0.6%
Other values (81) | 178 | 20.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Original Dataset

[Figure: common values plot]

Oversampled Dataset

No plot: the number of variable categories passes the threshold (config.plot.cat_freq.max_unique).

Value counts, Original Dataset
Value | Count | Frequency (%)
800 | 55 | 29.3%
750 | 53 | 28.2%
700 | 51 | 27.1%
650 | 29 | 15.4%

Value counts, Oversampled Dataset
Value | Count | Frequency (%)
800 | 206 | 23.9%
700 | 167 | 19.4%
750 | 165 | 19.1%
650 | 106 | 12.3%
698 | 8 | 0.9%
752 | 8 | 0.9%
751 | 7 | 0.8%
801 | 6 | 0.7%
749 | 6 | 0.7%
696 | 5 | 0.6%
Other values (81) | 178 | 20.6%

Most occurring characters

Original Dataset
Value | Count | Frequency (%)
0 | 294 | 52.1%
7 | 104 | 18.4%
5 | 82 | 14.5%
8 | 55 | 9.8%
6 | 29 | 5.1%

Oversampled Dataset
Value | Count | Frequency (%)
0 | 1056 | 40.8%
7 | 493 | 19.1%
5 | 337 | 13.0%
8 | 258 | 10.0%
6 | 205 | 7.9%
9 | 80 | 3.1%
4 | 69 | 2.7%
1 | 32 | 1.2%
2 | 31 | 1.2%
3 | 25 | 1.0%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Original Dataset | Decimal Number | 564 | 100.0%
Oversampled Dataset | Decimal Number | 2586 | 100.0%

Most frequent character per category

Decimal Number: every character is a decimal digit, so the counts match those under "Most occurring characters".

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 564 | 100.0%
Oversampled Dataset | Common | 2586 | 100.0%

Most frequent character per script

Common: every character belongs to this script, so the counts match those under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 564 | 100.0%
Oversampled Dataset | ASCII | 2586 | 100.0%

Most frequent character per block

ASCII: every character belongs to this block, so the counts match those under "Most occurring characters".

nozzle_voltage
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 6 | 18
Distinct (%) | 3.2% | 2.1%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 31.734043 | 31.954756
Minimum | 25 | 24
Maximum | 40 | 41
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 25 | 24
5-th percentile | 25 | 25
Q1 | 28 | 28
median | 31 | 31
Q3 | 37 | 37
95-th percentile | 40 | 40
Maximum | 40 | 41
Range | 15 | 17
Interquartile range (IQR) | 9 | 9

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 5.0540936 | 4.8385352
Coefficient of variation (CV) | 0.15926409 | 0.15141831
Kurtosis | -1.1485165 | -1.1798426
Mean | 31.734043 | 31.954756
Median Absolute Deviation (MAD) | 3 | 3
Skewness | 0.26638963 | 0.19087405
Sum | 5966 | 27545
Variance | 25.543862 | 23.411423
Monotonicity | Not monotonic | Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
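
Several of the descriptive statistics for nozzle_voltage are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported figures against those identities; the function name is illustrative.

```python
def summarize(std: float, mean: float, n: int) -> dict:
    """Derive CV, variance and sum from the reported std, mean and row count."""
    return {
        "cv": std / mean,
        "variance": std ** 2,
        "sum": mean * n,
    }

# Inputs are the reported nozzle_voltage statistics (188 and 862 observations).
original = summarize(std=5.0540936, mean=31.734043, n=188)
oversampled = summarize(std=4.8385352, mean=31.954756, n=862)
print(original)
print(oversampled)
```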
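Most frequent values, Original Dataset
Value | Count | Frequency (%)
31 | 39 | 20.7%
28 | 38 | 20.2%
25 | 35 | 18.6%
40 | 27 | 14.4%
34 | 25 | 13.3%
37 | 24 | 12.8%

Most frequent values, Oversampled Dataset
Value | Count | Frequency (%)
28 | 159 | 18.4%
31 | 130 | 15.1%
37 | 108 | 12.5%
25 | 103 | 11.9%
34 | 100 | 11.6%
40 | 94 | 10.9%
27 | 37 | 4.3%
36 | 32 | 3.7%
33 | 23 | 2.7%
30 | 22 | 2.6%
Other values (8) | 54 | 6.3%

Lowest values, Original Dataset
Value | Count | Frequency (%)
25 | 35 | 18.6%
28 | 38 | 20.2%
31 | 39 | 20.7%
34 | 25 | 13.3%
37 | 24 | 12.8%
40 | 27 | 14.4%

Lowest values, Oversampled Dataset
Value | Count | Frequency (%)
24 | 4 | 0.5%
25 | 103 | 11.9%
26 | 13 | 1.5%
27 | 37 | 4.3%
28 | 159 | 18.4%
29 | 5 | 0.6%
30 | 22 | 2.6%
31 | 130 | 15.1%
32 | 4 | 0.5%
33 | 23 | 2.7%

Lowest values, Oversampled Dataset (frequencies in this table are relative to the 188 observations of the Original Dataset)
Value | Count | Frequency (%)
24 | 4 | 2.1%
25 | 103 | 54.8%
26 | 13 | 6.9%
27 | 37 | 19.7%
28 | 159 | 84.6%
29 | 5 | 2.7%
30 | 22 | 11.7%
31 | 130 | 69.1%
32 | 4 | 2.1%
33 | 23 | 12.2%

Lowest values, Original Dataset (frequencies in this table are relative to the 862 observations of the Oversampled Dataset)
Value | Count | Frequency (%)
25 | 35 | 4.1%
28 | 38 | 4.4%
31 | 39 | 4.5%
34 | 25 | 2.9%
37 | 24 | 2.8%
40 | 27 | 3.1%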
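
In the nozzle_voltage frequency listings above, the same counts appear with two different percentages. A quick way to see which row count a percentage was computed against is to divide the count by each dataset size. A minimal sketch (names are illustrative):

```python
N_ORIGINAL, N_OVERSAMPLED = 188, 862

def pct(count: int, n_rows: int) -> float:
    """Frequency of a value as a percentage of n_rows, rounded to one decimal."""
    return round(100.0 * count / n_rows, 1)

# nozzle_voltage = 25 occurs 103 times in the Oversampled Dataset
# and 35 times in the Original Dataset.
print(pct(103, N_OVERSAMPLED), pct(103, N_ORIGINAL))
print(pct(35, N_ORIGINAL), pct(35, N_OVERSAMPLED))
```

The first value in each pair uses the dataset's own row count; the second uses the other dataset's row count, which is what the last two listings show.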

drop_spacing
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 10 | 11
Distinct (%) | 5.3% | 1.3%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 12.446809 | 11.087007
Minimum | 8 | 7
Maximum | 17 | 17
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 8 | 7
5-th percentile | 8 | 8
Q1 | 10 | 9
median | 12 | 10
Q3 | 15 | 13
95-th percentile | 17 | 17
Maximum | 17 | 17
Range | 9 | 10
Interquartile range (IQR) | 5 | 4
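
Range and IQR follow directly from the other quantile statistics: range = maximum − minimum and IQR = Q3 − Q1. A short sketch using the drop_spacing figures above (the helper name is illustrative):

```python
def spread(minimum, q1, q3, maximum):
    """Return the range and interquartile range for a set of quantiles."""
    return {"range": maximum - minimum, "iqr": q3 - q1}

original = spread(minimum=8, q1=10, q3=15, maximum=17)
oversampled = spread(minimum=7, q1=9, q3=13, maximum=17)
print(original, oversampled)
```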

Descriptive statistics

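Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 2.892428 | 2.9242219
Coefficient of variation (CV) | 0.2323831 | 0.26375215
Kurtosis | -1.2234632 | -0.81352848
Mean | 12.446809 | 11.087007
Median Absolute Deviation (MAD) | 2 | 2
Skewness | 0.062581913 | 0.6182005
Sum | 2340 | 9557
Variance | 8.3661395 | 8.5510737
Monotonicity | Not monotonic | Not monotonic

[Figure: histogram with fixed size bins (bins=10)]

Most frequent values, Original Dataset
Value | Count | Frequency (%)
9 | 21 | 11.2%
12 | 20 | 10.6%
17 | 20 | 10.6%
10 | 19 | 10.1%
11 | 19 | 10.1%
13 | 19 | 10.1%
16 | 19 | 10.1%
8 | 18 | 9.6%
14 | 18 | 9.6%
15 | 15 | 8.0%

Most frequent values, Oversampled Dataset
Value | Count | Frequency (%)
8 | 166 | 19.3%
9 | 145 | 16.8%
10 | 110 | 12.8%
11 | 91 | 10.6%
13 | 62 | 7.2%
12 | 57 | 6.6%
17 | 54 | 6.3%
16 | 52 | 6.0%
14 | 52 | 6.0%
15 | 43 | 5.0%

Lowest values, Original Dataset
Value | Count | Frequency (%)
8 | 18 | 9.6%
9 | 21 | 11.2%
10 | 19 | 10.1%
11 | 19 | 10.1%
12 | 20 | 10.6%
13 | 19 | 10.1%
14 | 18 | 9.6%
15 | 15 | 8.0%
16 | 19 | 10.1%
17 | 20 | 10.6%

Lowest values, Oversampled Dataset
Value | Count | Frequency (%)
7 | 30 | 3.5%
8 | 166 | 19.3%
9 | 145 | 16.8%
10 | 110 | 12.8%
11 | 91 | 10.6%
12 | 57 | 6.6%
13 | 62 | 7.2%
14 | 52 | 6.0%
15 | 43 | 5.0%
16 | 52 | 6.0%

Lowest values, Oversampled Dataset (frequencies in this table are relative to the 188 observations of the Original Dataset)
Value | Count | Frequency (%)
7 | 30 | 16.0%
8 | 166 | 88.3%
9 | 145 | 77.1%
10 | 110 | 58.5%
11 | 91 | 48.4%
12 | 57 | 30.3%
13 | 62 | 33.0%
14 | 52 | 27.7%
15 | 43 | 22.9%
16 | 52 | 27.7%

Lowest values, Original Dataset (frequencies in this table are relative to the 862 observations of the Oversampled Dataset)
Value | Count | Frequency (%)
8 | 18 | 2.1%
9 | 21 | 2.4%
10 | 19 | 2.2%
11 | 19 | 2.2%
12 | 20 | 2.3%
13 | 19 | 2.2%
14 | 18 | 2.1%
15 | 15 | 1.7%
16 | 19 | 2.2%
17 | 20 | 2.3%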

distance
Categorical

Statistic | Original Dataset | Oversampled Dataset
Distinct | 3 | 93
Distinct (%) | 1.6% | 10.8%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values, Original Dataset: 900 (139), 300 (47), 270 (2)
Most frequent values, Oversampled Dataset: 900 (486), 300 (164), 270 (7), 885 (7), 905 (7), other values (88): 191

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 3 | 3
Median length | 3 | 3
Mean length | 3 | 3
Min length | 3 | 3

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 564 | 2586
Distinct characters | 5 | 10
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 39
Unique (%) | 0.0% | 4.5%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 270 | 885
2nd row | 270 | 932
3rd row | 300 | 905
4th row | 300 | 916
5th row | 300 | 890

Common Values

Original Dataset
Value | Count | Frequency (%)
900 | 139 | 73.9%
300 | 47 | 25.0%
270 | 2 | 1.1%

Oversampled Dataset
Value | Count | Frequency (%)
900 | 486 | 56.4%
300 | 164 | 19.0%
270 | 7 | 0.8%
885 | 7 | 0.8%
905 | 7 | 0.8%
899 | 7 | 0.8%
911 | 5 | 0.6%
894 | 5 | 0.6%
897 | 5 | 0.6%
917 | 5 | 0.6%
Other values (83) | 164 | 19.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Original Dataset

[Figure: common values plot]

Oversampled Dataset

No plot: the number of variable categories passes the threshold (config.plot.cat_freq.max_unique).

Value counts, Original Dataset
Value | Count | Frequency (%)
900 | 139 | 73.9%
300 | 47 | 25.0%
270 | 2 | 1.1%

Value counts, Oversampled Dataset
Value | Count | Frequency (%)
900 | 486 | 56.4%
300 | 164 | 19.0%
270 | 7 | 0.8%
885 | 7 | 0.8%
905 | 7 | 0.8%
899 | 7 | 0.8%
911 | 5 | 0.6%
894 | 5 | 0.6%
897 | 5 | 0.6%
917 | 5 | 0.6%
Other values (83) | 164 | 19.0%

Most occurring characters

Original Dataset
Value | Count | Frequency (%)
0 | 374 | 66.3%
9 | 139 | 24.6%
3 | 47 | 8.3%
2 | 2 | 0.4%
7 | 2 | 0.4%

Oversampled Dataset
Value | Count | Frequency (%)
0 | 1347 | 52.1%
9 | 629 | 24.3%
3 | 197 | 7.6%
8 | 152 | 5.9%
2 | 70 | 2.7%
7 | 54 | 2.1%
1 | 54 | 2.1%
6 | 31 | 1.2%
5 | 28 | 1.1%
4 | 24 | 0.9%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Original Dataset | Decimal Number | 564 | 100.0%
Oversampled Dataset | Decimal Number | 2586 | 100.0%

Most frequent character per category

Decimal Number: every character is a decimal digit, so the counts match those under "Most occurring characters".

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 564 | 100.0%
Oversampled Dataset | Common | 2586 | 100.0%

Most frequent character per script

Common: every character belongs to this script, so the counts match those under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 564 | 100.0%
Oversampled Dataset | ASCII | 2586 | 100.0%

Most frequent character per block

ASCII: every character belongs to this block, so the counts match those under "Most occurring characters".

time
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 63 | 418
Distinct (%) | 33.5% | 48.5%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 71.138298 | 70.545258
Minimum | 31 | 29.37895
Maximum | 130 | 130
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 31 | 29.37895
5-th percentile | 34 | 34
Q1 | 45 | 58.190627
median | 69 | 69
Q3 | 89.25 | 88.886167
95-th percentile | 111.25 | 108
Maximum | 130 | 130
Range | 99 | 100.62105
Interquartile range (IQR) | 44.25 | 30.69554

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 24.68826 | 23.94049
Coefficient of variation (CV) | 0.34704598 | 0.33936357
Kurtosis | -0.82393317 | -0.86428222
Mean | 71.138298 | 70.545258
Median Absolute Deviation (MAD) | 21.5 | 19
Skewness | 0.16926931 | 0.068783337
Sum | 13374 | 60810.012
Variance | 609.51018 | 573.14707
Monotonicity | Not monotonic | Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
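Most frequent values, Original Dataset
Value | Count | Frequency (%)
78 | 9 | 4.8%
44 | 9 | 4.8%
66 | 8 | 4.3%
96 | 8 | 4.3%
38 | 8 | 4.3%
63 | 7 | 3.7%
107 | 6 | 3.2%
61 | 6 | 3.2%
83 | 5 | 2.7%
60 | 5 | 2.7%
Other values (53) | 117 | 62.2%

Most frequent values, Oversampled Dataset
Value | Count | Frequency (%)
78 | 30 | 3.5%
61 | 26 | 3.0%
44 | 21 | 2.4%
34 | 21 | 2.4%
66 | 20 | 2.3%
96 | 19 | 2.2%
38 | 18 | 2.1%
63 | 17 | 2.0%
107 | 17 | 2.0%
83 | 13 | 1.5%
Other values (408) | 660 | 76.6%

Lowest values, Original Dataset
Value | Count | Frequency (%)
31 | 2 | 1.1%
32 | 4 | 2.1%
34 | 5 | 2.7%
35 | 1 | 0.5%
36 | 2 | 1.1%
37 | 2 | 1.1%
38 | 8 | 4.3%
39 | 1 | 0.5%
40 | 4 | 2.1%
41 | 2 | 1.1%

Lowest values, Oversampled Dataset
Value | Count | Frequency (%)
29.37895013 | 1 | 0.1%
31 | 5 | 0.6%
31.30534251 | 1 | 0.1%
31.46443976 | 1 | 0.1%
31.62718419 | 1 | 0.1%
31.76417297 | 1 | 0.1%
31.99393802 | 1 | 0.1%
32 | 11 | 1.3%
32.08223617 | 1 | 0.1%
32.56794451 | 1 | 0.1%

Lowest values, Oversampled Dataset (frequencies in this table are relative to the 188 observations of the Original Dataset)
Value | Count | Frequency (%)
29.37895013 | 1 | 0.5%
31 | 5 | 2.7%
31.30534251 | 1 | 0.5%
31.46443976 | 1 | 0.5%
31.62718419 | 1 | 0.5%
31.76417297 | 1 | 0.5%
31.99393802 | 1 | 0.5%
32 | 11 | 5.9%
32.08223617 | 1 | 0.5%
32.56794451 | 1 | 0.5%

Lowest values, Original Dataset (frequencies in this table are relative to the 862 observations of the Oversampled Dataset)
Value | Count | Frequency (%)
31 | 2 | 0.2%
32 | 4 | 0.5%
34 | 5 | 0.6%
35 | 1 | 0.1%
36 | 2 | 0.2%
37 | 2 | 0.2%
38 | 8 | 0.9%
39 | 1 | 0.1%
40 | 4 | 0.5%
41 | 2 | 0.2%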

velocity
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 73 | 436
Distinct (%) | 38.8% | 50.6%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 10.462428 | 10.698038
Minimum | 6.667 | 6.5215865
Maximum | 15.517241 | 15.517241
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 6.667 | 6.5215865
5-th percentile | 6.818 | 6.977
Q1 | 8.27675 | 8.411215
median | 9.945 | 10.229517
Q3 | 12.9035 | 13.061486
95-th percentile | 14.9139 | 14.881777
Maximum | 15.517241 | 15.517241
Range | 8.8502414 | 8.9956549
Interquartile range (IQR) | 4.62675 | 4.6502713

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 2.6390637 | 2.563894
Coefficient of variation (CV) | 0.25224199 | 0.2396602
Kurtosis | -1.181831 | -1.2513327
Mean | 10.462428 | 10.698038
Median Absolute Deviation (MAD) | 2.05 | 2.0994828
Skewness | 0.32023116 | 0.24287202
Sum | 1966.9365 | 9221.7091
Variance | 6.9646573 | 6.5735526
Monotonicity | Not monotonic | Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
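Most frequent values, Original Dataset
Value | Count | Frequency (%)
9.375 | 12 | 6.4%
6.818 | 9 | 4.8%
11.538 | 9 | 4.8%
13.636 | 6 | 3.2%
7.895 | 6 | 3.2%
14.754 | 6 | 3.2%
10.843 | 5 | 2.7%
15 | 5 | 2.7%
8.411214953 | 5 | 2.7%
10.345 | 5 | 2.7%
Other values (63) | 120 | 63.8%

Most frequent values, Oversampled Dataset
Value | Count | Frequency (%)
11.538 | 30 | 3.5%
9.375 | 30 | 3.5%
14.754 | 26 | 3.0%
6.818 | 21 | 2.4%
13.636 | 14 | 1.6%
8.411214953 | 14 | 1.6%
7.895 | 13 | 1.5%
15 | 13 | 1.5%
10.843 | 13 | 1.5%
13.043 | 11 | 1.3%
Other values (426) | 677 | 78.5%

Lowest values, Original Dataset
Value | Count | Frequency (%)
6.667 | 4 | 2.1%
6.818 | 9 | 4.8%
6.923 | 2 | 1.1%
6.976744186 | 1 | 0.5%
6.977 | 2 | 1.1%
7.142857143 | 1 | 0.5%
7.143 | 2 | 1.1%
7.317 | 1 | 0.5%
7.317073171 | 1 | 0.5%
7.5 | 4 | 2.1%

Lowest values, Oversampled Dataset
Value | Count | Frequency (%)
6.521586489 | 1 | 0.1%
6.667 | 10 | 1.2%
6.693525798 | 1 | 0.1%
6.818 | 21 | 2.4%
6.868515188 | 1 | 0.1%
6.911431133 | 1 | 0.1%
6.912361405 | 1 | 0.1%
6.923 | 4 | 0.5%
6.966472375 | 1 | 0.1%
6.976744186 | 1 | 0.1%

Lowest values, Oversampled Dataset (frequencies in this table are relative to the 188 observations of the Original Dataset)
Value | Count | Frequency (%)
6.521586489 | 1 | 0.5%
6.667 | 10 | 5.3%
6.693525798 | 1 | 0.5%
6.818 | 21 | 11.2%
6.868515188 | 1 | 0.5%
6.911431133 | 1 | 0.5%
6.912361405 | 1 | 0.5%
6.923 | 4 | 2.1%
6.966472375 | 1 | 0.5%
6.976744186 | 1 | 0.5%

Lowest values, Original Dataset (frequencies in this table are relative to the 862 observations of the Oversampled Dataset)
Value | Count | Frequency (%)
6.667 | 4 | 0.5%
6.818 | 9 | 1.0%
6.923 | 2 | 0.2%
6.976744186 | 1 | 0.1%
6.977 | 2 | 0.2%
7.142857143 | 1 | 0.1%
7.143 | 2 | 0.2%
7.317 | 1 | 0.1%
7.317073171 | 1 | 0.1%
7.5 | 4 | 0.5%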

ink_visco_cp
Categorical

Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 220
Distinct (%) | 1.1% | 25.5%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values, Original Dataset: 6.9 (140), 6.3 (48)
Most frequent values, Oversampled Dataset: 6.9 (478), 6.3 (166), 6.888742912918097 (1), 6.303726622667084 (1), 6.904829801179935 (1), other values (215): 215

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 3 | 18
Median length | 3 | 3
Mean length | 3 | 6.5382831
Min length | 3 | 3

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 564 | 5636
Distinct characters | 4 | 11
Distinct categories | 2 | 2
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
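
Because ink_visco_cp is profiled as text, its length statistics describe the string form of each number. The mean length is the total character count divided by the number of observations. The sketch below recomputes it from the figures in this section (the helper name is illustrative):

```python
def mean_length(total_characters: int, n_rows: int) -> float:
    """Average string length of a text-typed column."""
    return total_characters / n_rows

original = mean_length(564, 188)       # every value has 3 characters, e.g. "6.9"
oversampled = mean_length(5636, 862)   # synthetic values carry many more digits
print(original, round(oversampled, 7))

# A synthetic value from the sample rows is far longer than "6.9".
print(len("6.9"), len("6.888742912918097"))
```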

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 218
Unique (%) | 0.0% | 25.3%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 6.3 | 6.888742912918097
2nd row | 6.3 | 6.893399508815485
3rd row | 6.3 | 6.887352564552868
4th row | 6.3 | 6.891240628049628
5th row | 6.3 | 6.910622627084694

Common Values

Original Dataset
Value | Count | Frequency (%)
6.9 | 140 | 74.5%
6.3 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
6.9 | 478 | 55.5%
6.3 | 166 | 19.3%
6.888742912918097 | 1 | 0.1%
6.303726622667084 | 1 | 0.1%
6.904829801179935 | 1 | 0.1%
6.884262069178429 | 1 | 0.1%
6.8932259500224955 | 1 | 0.1%
6.904776849219869 | 1 | 0.1%
6.903967139609121 | 1 | 0.1%
6.909098073100913 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Original Dataset

[Figure: common values plot]

Oversampled Dataset

No plot: the number of variable categories passes the threshold (config.plot.cat_freq.max_unique).

Value counts, Original Dataset
Value | Count | Frequency (%)
6.9 | 140 | 74.5%
6.3 | 48 | 25.5%

Value counts, Oversampled Dataset
Value | Count | Frequency (%)
6.9 | 478 | 55.5%
6.3 | 166 | 19.3%
6.290514817904113 | 1 | 0.1%
6.309679703483194 | 1 | 0.1%
6.3052666129718355 | 1 | 0.1%
6.298005557129865 | 1 | 0.1%
6.29406572012702 | 1 | 0.1%
6.887352564552868 | 1 | 0.1%
6.891240628049628 | 1 | 0.1%
6.910622627084694 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Most occurring characters

Original Dataset
Value | Count | Frequency (%)
6 | 188 | 33.3%
. | 188 | 33.3%
9 | 140 | 24.8%
3 | 48 | 8.5%

Oversampled Dataset
Value | Count | Frequency (%)
6 | 1166 | 20.7%
9 | 898 | 15.9%
. | 862 | 15.3%
3 | 483 | 8.6%
8 | 380 | 6.7%
2 | 348 | 6.2%
0 | 318 | 5.6%
1 | 310 | 5.5%
7 | 295 | 5.2%
4 | 294 | 5.2%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Original Dataset | Decimal Number | 376 | 66.7%
Original Dataset | Other Punctuation | 188 | 33.3%
Oversampled Dataset | Decimal Number | 4774 | 84.7%
Oversampled Dataset | Other Punctuation | 862 | 15.3%

Most frequent character per category

Decimal Number, Original Dataset
Value | Count | Frequency (%)
6 | 188 | 50.0%
9 | 140 | 37.2%
3 | 48 | 12.8%

Decimal Number, Oversampled Dataset
Value | Count | Frequency (%)
6 | 1166 | 24.4%
9 | 898 | 18.8%
3 | 483 | 10.1%
8 | 380 | 8.0%
2 | 348 | 7.3%
0 | 318 | 6.7%
1 | 310 | 6.5%
7 | 295 | 6.2%
4 | 294 | 6.2%
5 | 282 | 5.9%

Other Punctuation
Dataset | Value | Count | Frequency (%)
Original Dataset | . | 188 | 100.0%
Oversampled Dataset | . | 862 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 564 | 100.0%
Oversampled Dataset | Common | 5636 | 100.0%

Most frequent character per script

Common: every character belongs to this script, so the counts match those under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 564 | 100.0%
Oversampled Dataset | ASCII | 5636 | 100.0%

Most frequent character per block

ASCII: every character belongs to this block, so the counts match those under "Most occurring characters".

ink_visco_pas
Categorical

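Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 220
Distinct (%) | 1.1% | 25.5%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values, Original Dataset: 0.0069 (140), 0.0063 (48)
Most frequent values, Oversampled Dataset: 0.0069 (478), 0.0063 (166), 0.006903595461368387 (1), 0.006286979594108262 (1), 0.006884431627196074 (1), other values (215): 215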
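
The two dominant values here, 0.0069 and 0.0063, are the ink_visco_cp values 6.9 and 6.3 scaled by 1/1000, and they have the same counts. This is consistent with the column suffixes denoting units (centipoise and pascal-seconds, where 1 cP = 0.001 Pa·s), although the report itself does not say so. A minimal sketch under that assumption:

```python
CP_TO_PAS = 1e-3  # 1 centipoise = 0.001 pascal-second

def cp_to_pas(viscosity_cp: float) -> float:
    """Convert a dynamic viscosity from cP to Pa·s."""
    return viscosity_cp * CP_TO_PAS

# Assumption: ink_visco_pas is ink_visco_cp expressed in Pa·s.
for cp in (6.9, 6.3):
    print(cp, "cP =", round(cp_to_pas(cp), 6), "Pa·s")
```

If the assumption holds, the two columns carry the same information, which would explain why both appear in the same High Correlation alerts for the Original Dataset.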

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 6 | 21
Median length | 6 | 6
Mean length | 6 | 9.5556845
Min length | 6 | 6

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 1128 | 8237
Distinct characters | 5 | 11
Distinct categories | 2 | 2
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 218
Unique (%) | 0.0% | 25.3%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 0.0063 | 0.006903595461368387
2nd row | 0.0063 | 0.0068775336775216645
3rd row | 0.0063 | 0.0069227014084642415
4th row | 0.0063 | 0.006902995037935769
5th row | 0.0063 | 0.006890096759076731

Common Values

Original Dataset
Value | Count | Frequency (%)
0.0069 | 140 | 74.5%
0.0063 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
0.0069 | 478 | 55.5%
0.0063 | 166 | 19.3%
0.006903595461368387 | 1 | 0.1%
0.006286979594108262 | 1 | 0.1%
0.006884431627196074 | 1 | 0.1%
0.006904137999694823 | 1 | 0.1%
0.006895696467744304 | 1 | 0.1%
0.006914235760015329 | 1 | 0.1%
0.006920817222662482 | 1 | 0.1%
0.006913285273365126 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Original Dataset

[Figure: common values plot]

Oversampled Dataset

No plot: the number of variable categories passes the threshold (config.plot.cat_freq.max_unique).

Value counts, Original Dataset
Value | Count | Frequency (%)
0.0069 | 140 | 74.5%
0.0063 | 48 | 25.5%

Value counts, Oversampled Dataset
Value | Count | Frequency (%)
0.0069 | 478 | 55.5%
0.0063 | 166 | 19.3%
0.006273510036654438 | 1 | 0.1%
0.006339592513855169 | 1 | 0.1%
0.006301127524764602 | 1 | 0.1%
0.00628356097808503 | 1 | 0.1%
0.006307181349639772 | 1 | 0.1%
0.0069227014084642415 | 1 | 0.1%
0.006902995037935769 | 1 | 0.1%
0.006890096759076731 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Most occurring characters

Original Dataset
Value | Count | Frequency (%)
0 | 564 | 50.0%
. | 188 | 16.7%
6 | 188 | 16.7%
9 | 140 | 12.4%
3 | 48 | 4.3%

Oversampled Dataset
Value | Count | Frequency (%)
0 | 2931 | 35.6%
6 | 1149 | 13.9%
9 | 871 | 10.6%
. | 862 | 10.5%
3 | 522 | 6.3%
8 | 375 | 4.6%
2 | 314 | 3.8%
1 | 310 | 3.8%
5 | 307 | 3.7%
4 | 304 | 3.7%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Original Dataset | Decimal Number | 940 | 83.3%
Original Dataset | Other Punctuation | 188 | 16.7%
Oversampled Dataset | Decimal Number | 7375 | 89.5%
Oversampled Dataset | Other Punctuation | 862 | 10.5%

Most frequent character per category

Decimal Number, Original Dataset
Value | Count | Frequency (%)
0 | 564 | 60.0%
6 | 188 | 20.0%
9 | 140 | 14.9%
3 | 48 | 5.1%

Decimal Number, Oversampled Dataset
Value | Count | Frequency (%)
0 | 2931 | 39.7%
6 | 1149 | 15.6%
9 | 871 | 11.8%
3 | 522 | 7.1%
8 | 375 | 5.1%
2 | 314 | 4.3%
1 | 310 | 4.2%
5 | 307 | 4.2%
4 | 304 | 4.1%
7 | 292 | 4.0%

Other Punctuation
Dataset | Value | Count | Frequency (%)
Original Dataset | . | 188 | 100.0%
Oversampled Dataset | . | 862 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 1128 | 100.0%
Oversampled Dataset | Common | 8237 | 100.0%

Most frequent character per script

Common: every character belongs to this script, so the counts match those under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 1128 | 100.0%
Oversampled Dataset | ASCII | 8237 | 100.0%

Most frequent character per block

ASCII: every character belongs to this block, so the counts match those under "Most occurring characters".

surface_tension_dyne_cm
Categorical

Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 220
Distinct (%) | 1.1% | 25.5%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values, Original Dataset: 32.3 (140), 30.9 (48)
Most frequent values, Oversampled Dataset: 32.3 (478), 30.9 (166), 32.34783906043541 (1), 30.87347925930446 (1), 32.280776397308834 (1), other values (215): 215

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 4 | 18
Median length | 4 | 4
Mean length | 4 | 7.3805104
Min length | 4 | 4

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 752 | 6362
Distinct characters | 5 | 11
Distinct categories | 2 | 2
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 218
Unique (%) | 0.0% | 25.3%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 30.9 | 32.34783906043541
2nd row | 30.9 | 32.283624138681425
3rd row | 30.9 | 32.337865651592175
4th row | 30.9 | 32.28298077255031
5th row | 30.9 | 32.2537409056809

Common Values

Original Dataset
Value | Count | Frequency (%)
32.3 | 140 | 74.5%
30.9 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
32.3 | 478 | 55.5%
30.9 | 166 | 19.3%
32.34783906043541 | 1 | 0.1%
30.87347925930446 | 1 | 0.1%
32.280776397308834 | 1 | 0.1%
32.306511899682114 | 1 | 0.1%
32.262919873546714 | 1 | 0.1%
32.284377190837326 | 1 | 0.1%
32.30019294664489 | 1 | 0.1%
32.33288657620021 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Original Dataset

[Figure: common values plot]

Oversampled Dataset

No plot: the number of variable categories passes the threshold (config.plot.cat_freq.max_unique).

Value counts, Original Dataset
Value | Count | Frequency (%)
32.3 | 140 | 74.5%
30.9 | 48 | 25.5%

Value counts, Oversampled Dataset
Value | Count | Frequency (%)
32.3 | 478 | 55.5%
30.9 | 166 | 19.3%
30.814474082757403 | 1 | 0.1%
30.988164225060828 | 1 | 0.1%
30.92520628554013 | 1 | 0.1%
30.8970740223528 | 1 | 0.1%
30.873991364523544 | 1 | 0.1%
32.337865651592175 | 1 | 0.1%
32.28298077255031 | 1 | 0.1%
32.2537409056809 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Most occurring characters

Original Dataset
Value | Count | Frequency (%)
3 | 328 | 43.6%
. | 188 | 25.0%
2 | 140 | 18.6%
0 | 48 | 6.4%
9 | 48 | 6.4%

Oversampled Dataset
Value | Count | Frequency (%)
3 | 1697 | 26.7%
2 | 984 | 15.5%
. | 862 | 13.5%
0 | 501 | 7.9%
9 | 457 | 7.2%
8 | 332 | 5.2%
1 | 314 | 4.9%
4 | 309 | 4.9%
6 | 305 | 4.8%
7 | 303 | 4.8%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Original Dataset | Decimal Number | 564 | 75.0%
Original Dataset | Other Punctuation | 188 | 25.0%
Oversampled Dataset | Decimal Number | 5500 | 86.5%
Oversampled Dataset | Other Punctuation | 862 | 13.5%

Most frequent character per category

Decimal Number, Original Dataset
Value | Count | Frequency (%)
3 | 328 | 58.2%
2 | 140 | 24.8%
0 | 48 | 8.5%
9 | 48 | 8.5%

Decimal Number, Oversampled Dataset
Value | Count | Frequency (%)
3 | 1697 | 30.9%
2 | 984 | 17.9%
0 | 501 | 9.1%
9 | 457 | 8.3%
8 | 332 | 6.0%
1 | 314 | 5.7%
4 | 309 | 5.6%
6 | 305 | 5.5%
7 | 303 | 5.5%
5 | 298 | 5.4%

Other Punctuation
Dataset | Value | Count | Frequency (%)
Original Dataset | . | 188 | 100.0%
Oversampled Dataset | . | 862 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 752 | 100.0%
Oversampled Dataset | Common | 6362 | 100.0%

Most frequent character per script

Common: every character belongs to this script, so the counts match those under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 752 | 100.0%
Oversampled Dataset | ASCII | 6362 | 100.0%

Most frequent character per block

ASCII: every character belongs to this block, so the counts match those under "Most occurring characters".

surface_tension_n_m
Categorical

Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 220
Distinct (%) | 1.1% | 25.5%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values, Original Dataset: 0.0323 (140), 0.0309 (48)
Most frequent values, Oversampled Dataset: 0.0323 (478), 0.0309 (166), 0.032268260776017396 (1), 0.030943580197493412 (1), 0.03232251169711175 (1), other values (215): 215

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 6 | 20
Median length | 6 | 6
Mean length | 6 | 9.3851508
Min length | 6 | 6

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 1128 | 8090
Distinct characters | 5 | 11
Distinct categories | 2 | 2
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 218
Unique (%) | 0.0% | 25.3%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 0.0309 | 0.032268260776017396
2nd row | 0.0309 | 0.03229541837982759
3rd row | 0.0309 | 0.03224684230677582
4th row | 0.0309 | 0.03223572597987773
5th row | 0.0309 | 0.0323163591276323

Common Values

Original Dataset
Value | Count | Frequency (%)
0.0323 | 140 | 74.5%
0.0309 | 48 | 25.5%

Oversampled Dataset
Value | Count | Frequency (%)
0.0323 | 478 | 55.5%
0.0309 | 166 | 19.3%
0.032268260776017396 | 1 | 0.1%
0.030943580197493412 | 1 | 0.1%
0.03232251169711175 | 1 | 0.1%
0.03231302897910061 | 1 | 0.1%
0.03227172213615031 | 1 | 0.1%
0.03227222592570511 | 1 | 0.1%
0.03229562766064959 | 1 | 0.1%
0.03225126470119788 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

Original Dataset

[Figure: common values plot]

Oversampled Dataset

No plot: the number of variable categories passes the threshold (config.plot.cat_freq.max_unique).

Value counts, Original Dataset
Value | Count | Frequency (%)
0.0323 | 140 | 74.5%
0.0309 | 48 | 25.5%

Value counts, Oversampled Dataset
Value | Count | Frequency (%)
0.0323 | 478 | 55.5%
0.0309 | 166 | 19.3%
0.03085860650387104 | 1 | 0.1%
0.030897142938943228 | 1 | 0.1%
0.030854761944449218 | 1 | 0.1%
0.030854547220732907 | 1 | 0.1%
0.030908430277787283 | 1 | 0.1%
0.03224684230677582 | 1 | 0.1%
0.03223572597987773 | 1 | 0.1%
0.0323163591276323 | 1 | 0.1%
Other values (210) | 210 | 24.4%

Most occurring characters

Original Dataset
Value | Count | Frequency (%)
0 | 424 | 37.6%
3 | 328 | 29.1%
. | 188 | 16.7%
2 | 140 | 12.4%
9 | 48 | 4.3%

Oversampled Dataset
Value | Count | Frequency (%)
0 | 2197 | 27.2%
3 | 1681 | 20.8%
2 | 997 | 12.3%
. | 862 | 10.7%
9 | 470 | 5.8%
5 | 339 | 4.2%
8 | 322 | 4.0%
4 | 320 | 4.0%
7 | 308 | 3.8%
6 | 299 | 3.7%

Most occurring categories

Dataset | Value | Count | Frequency (%)
Original Dataset | Decimal Number | 940 | 83.3%
Original Dataset | Other Punctuation | 188 | 16.7%
Oversampled Dataset | Decimal Number | 7228 | 89.3%
Oversampled Dataset | Other Punctuation | 862 | 10.7%

Most frequent character per category

Decimal Number, Original Dataset
Value | Count | Frequency (%)
0 | 424 | 45.1%
3 | 328 | 34.9%
2 | 140 | 14.9%
9 | 48 | 5.1%

Decimal Number, Oversampled Dataset
Value | Count | Frequency (%)
0 | 2197 | 30.4%
3 | 1681 | 23.3%
2 | 997 | 13.8%
9 | 470 | 6.5%
5 | 339 | 4.7%
8 | 322 | 4.5%
4 | 320 | 4.4%
7 | 308 | 4.3%
6 | 299 | 4.1%
1 | 295 | 4.1%

Other Punctuation
Dataset | Value | Count | Frequency (%)
Original Dataset | . | 188 | 100.0%
Oversampled Dataset | . | 862 | 100.0%

Most occurring scripts

Dataset | Value | Count | Frequency (%)
Original Dataset | Common | 1128 | 100.0%
Oversampled Dataset | Common | 8090 | 100.0%

Most frequent character per script

Common: every character belongs to this script, so the counts match those under "Most occurring characters".

Most occurring blocks

Dataset | Value | Count | Frequency (%)
Original Dataset | ASCII | 1128 | 100.0%
Oversampled Dataset | ASCII | 8090 | 100.0%

Most frequent character per block

ASCII: every character belongs to this block, so the counts match those under "Most occurring characters".

ink _density
Categorical

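Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 51
Distinct (%) | 1.1% | 5.9%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values (count), Original Dataset: 1614 (140), 1517 (48)
Most frequent values (count), Oversampled Dataset: 1614 (503), 1517 (174), 1613 (24), 1612 (16), 1611 (14), other values (46 distinct, 131 rows)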

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 4 | 4
Median length | 4 | 4
Mean length | 4 | 4
Min length | 4 | 4

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 752 | 3448
Distinct characters | 5 | 10
Distinct categories | 1 | 1
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 27
Unique (%) | 0.0% | 3.1%

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 1517 | 1620
2nd row | 1517 | 1621
3rd row | 1517 | 1613
4th row | 1517 | 1611
5th row | 1517 | 1614

Common Values

ValueCountFrequency (%)
1614 140
74.5%
1517 48
 
25.5%
ValueCountFrequency (%)
1614 503
58.4%
1517 174
 
20.2%
1613 24
 
2.8%
1612 16
 
1.9%
1611 14
 
1.6%
1616 14
 
1.6%
1615 14
 
1.6%
1610 10
 
1.2%
1617 9
 
1.0%
1519 7
 
0.8%
Other values (41) 77
 
8.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Original Dataset


Oversampled Dataset


Number of variable categories exceeds the plotting threshold (config.plot.cat_freq.max_unique)
ValueCountFrequency (%)
1614 140
74.5%
1517 48
 
25.5%
ValueCountFrequency (%)
1614 503
58.4%
1517 174
 
20.2%
1613 24
 
2.8%
1612 16
 
1.9%
1611 14
 
1.6%
1616 14
 
1.6%
1615 14
 
1.6%
1610 10
 
1.2%
1617 9
 
1.0%
1519 7
 
0.8%
Other values (41) 77
 
8.9%

Most occurring characters

ValueCountFrequency (%)
1 376
50.0%
6 140
 
18.6%
4 140
 
18.6%
5 48
 
6.4%
7 48
 
6.4%
ValueCountFrequency (%)
1 1702
49.4%
6 649
 
18.8%
4 512
 
14.8%
5 263
 
7.6%
7 187
 
5.4%
3 36
 
1.0%
2 32
 
0.9%
0 27
 
0.8%
8 21
 
0.6%
9 19
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 752
100.0%
ValueCountFrequency (%)
Decimal Number 3448
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 376
50.0%
6 140
 
18.6%
4 140
 
18.6%
5 48
 
6.4%
7 48
 
6.4%
ValueCountFrequency (%)
1 1702
49.4%
6 649
 
18.8%
4 512
 
14.8%
5 263
 
7.6%
7 187
 
5.4%
3 36
 
1.0%
2 32
 
0.9%
0 27
 
0.8%
8 21
 
0.6%
9 19
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Common 752
100.0%
ValueCountFrequency (%)
Common 3448
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 376
50.0%
6 140
 
18.6%
4 140
 
18.6%
5 48
 
6.4%
7 48
 
6.4%
ValueCountFrequency (%)
1 1702
49.4%
6 649
 
18.8%
4 512
 
14.8%
5 263
 
7.6%
7 187
 
5.4%
3 36
 
1.0%
2 32
 
0.9%
0 27
 
0.8%
8 21
 
0.6%
9 19
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 752
100.0%
ValueCountFrequency (%)
ASCII 3448
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 376
50.0%
6 140
 
18.6%
4 140
 
18.6%
5 48
 
6.4%
7 48
 
6.4%
ValueCountFrequency (%)
1 1702
49.4%
6 649
 
18.8%
4 512
 
14.8%
5 263
 
7.6%
7 187
 
5.4%
3 36
 
1.0%
2 32
 
0.9%
0 27
 
0.8%
8 21
 
0.6%
9 19
 
0.6%

z_number
Categorical

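Statistic | Original Dataset | Oversampled Dataset
Distinct | 2 | 220
Distinct (%) | 1.1% | 25.5%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Most frequent values (count), Original Dataset: 4.852026516 (140), 5.039073704 (48)
Most frequent values (count), Oversampled Dataset: 4.852026516 (478), 5.039073704 (166), 4.856875888161474 (1), 5.03703986974655 (1), 4.851110241274762 (1), other values (215 distinct, 215 rows)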

Length

Statistic | Original Dataset | Oversampled Dataset
Max length | 11 | 18
Median length | 11 | 11
Mean length | 11 | 12.518561
Min length | 11 | 11

Characters and Unicode

Statistic | Original Dataset | Oversampled Dataset
Total characters | 2068 | 10791
Distinct characters | 11 | 11
Distinct categories | 2 | 2
Distinct scripts | 1 | 1
Distinct blocks | 1 | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Statistic | Original Dataset | Oversampled Dataset
Unique | 0 | 218
Unique (%) | 0.0% | 25.3%
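
The difference between "Distinct" and "Unique" can be sketched as follows, assuming "Distinct" counts different values and "Unique" counts values that occur exactly once, which is consistent with the figures reported here. The list is rebuilt from the Oversampled Dataset counts for z_number. The 218 single-occurrence values are replaced by placeholder strings, and the helper name `distinct_and_unique` is hypothetical.

```python
from collections import Counter

# Rebuilt from the Oversampled Dataset counts for z_number: two repeated values
# plus 218 values that occur once (placeholder strings stand in for them).
values = (
    ["4.852026516"] * 478
    + ["5.039073704"] * 166
    + [f"singleton_{i}" for i in range(218)]
)

def distinct_and_unique(items):
    """Return (distinct, unique): different values vs. values seen exactly once."""
    counts = Counter(items)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

distinct, unique = distinct_and_unique(values)
n = len(values)
print(distinct, f"{100 * distinct / n:.1f}%")  # distinct count and share of rows
print(unique, f"{100 * unique / n:.1f}%")      # unique count and share of rows
```

Under these assumptions the sketch gives 220 distinct and 218 unique values out of 862 rows, in line with the reported 25.5% and 25.3%.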

Sample

Row | Original Dataset | Oversampled Dataset
1st row | 5.039073704 | 4.856875888161474
2nd row | 5.039073704 | 4.84609410928064
3rd row | 5.039073704 | 4.856909959843857
4th row | 5.039073704 | 4.8536827443158765
5th row | 5.039073704 | 4.838554282298314

Common Values

ValueCountFrequency (%)
4.852026516 140
74.5%
5.039073704 48
 
25.5%
ValueCountFrequency (%)
4.852026516 478
55.5%
5.039073704 166
 
19.3%
4.856875888161474 1
 
0.1%
5.03703986974655 1
 
0.1%
4.851110241274762 1
 
0.1%
4.855038331302581 1
 
0.1%
4.848046787564382 1
 
0.1%
4.850463780009655 1
 
0.1%
4.846939373768291 1
 
0.1%
4.85018017737832 1
 
0.1%
Other values (210) 210
24.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Original Dataset


Oversampled Dataset


Number of variable categories exceeds the plotting threshold (config.plot.cat_freq.max_unique)
ValueCountFrequency (%)
4.852026516 140
74.5%
5.039073704 48
 
25.5%
ValueCountFrequency (%)
4.852026516 478
55.5%
5.039073704 166
 
19.3%
5.0253831383447025 1
 
0.1%
5.040709795156401 1
 
0.1%
5.039974089921142 1
 
0.1%
5.048773466730579 1
 
0.1%
5.038596387555407 1
 
0.1%
4.856909959843857 1
 
0.1%
4.8536827443158765 1
 
0.1%
4.838554282298314 1
 
0.1%
Other values (210) 210
24.4%

Most occurring characters

ValueCountFrequency (%)
5 328
15.9%
0 284
13.7%
2 280
13.5%
6 280
13.5%
4 188
9.1%
. 188
9.1%
8 140
6.8%
1 140
6.8%
3 96
 
4.6%
7 96
 
4.6%
ValueCountFrequency (%)
5 1567
14.5%
0 1272
11.8%
6 1248
11.6%
2 1243
11.5%
4 1165
10.8%
8 916
8.5%
. 862
8.0%
1 743
6.9%
3 681
6.3%
7 601
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1880
90.9%
Other Punctuation 188
 
9.1%
ValueCountFrequency (%)
Decimal Number 9929
92.0%
Other Punctuation 862
 
8.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 328
17.4%
0 284
15.1%
2 280
14.9%
6 280
14.9%
4 188
10.0%
8 140
7.4%
1 140
7.4%
3 96
 
5.1%
7 96
 
5.1%
9 48
 
2.6%
ValueCountFrequency (%)
5 1567
15.8%
0 1272
12.8%
6 1248
12.6%
2 1243
12.5%
4 1165
11.7%
8 916
9.2%
1 743
7.5%
3 681
6.9%
7 601
 
6.1%
9 493
 
5.0%
Other Punctuation
ValueCountFrequency (%)
. 188
100.0%
ValueCountFrequency (%)
. 862
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2068
100.0%
ValueCountFrequency (%)
Common 10791
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 328
15.9%
0 284
13.7%
2 280
13.5%
6 280
13.5%
4 188
9.1%
. 188
9.1%
8 140
6.8%
1 140
6.8%
3 96
 
4.6%
7 96
 
4.6%
ValueCountFrequency (%)
5 1567
14.5%
0 1272
11.8%
6 1248
11.6%
2 1243
11.5%
4 1165
10.8%
8 916
8.5%
. 862
8.0%
1 743
6.9%
3 681
6.3%
7 601
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2068
100.0%
ValueCountFrequency (%)
ASCII 10791
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 328
15.9%
0 284
13.7%
2 280
13.5%
6 280
13.5%
4 188
9.1%
. 188
9.1%
8 140
6.8%
1 140
6.8%
3 96
 
4.6%
7 96
 
4.6%
ValueCountFrequency (%)
5 1567
14.5%
0 1272
11.8%
6 1248
11.6%
2 1243
11.5%
4 1165
10.8%
8 916
8.5%
. 862
8.0%
1 743
6.9%
3 681
6.3%
7 601
 
5.6%

line_width
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 100 | 160
Distinct (%) | 53.2% | 18.6%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 229.3617 | 252.91415
Minimum | 112 | 112
Maximum | 391 | 394
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 112 | 112
5-th percentile | 179 | 183
Q1 | 194 | 210
Median | 222.5 | 254
Q3 | 260 | 294.75
95-th percentile | 305.65 | 321.95
Maximum | 391 | 394
Range | 279 | 282
Interquartile range (IQR) | 66 | 84.75

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 43.833631 | 49.024463
Coefficient of variation (CV) | 0.19111138 | 0.19383835
Kurtosis | 0.33187771 | -0.59429805
Mean | 229.3617 | 252.91415
Median Absolute Deviation (MAD) | 31.5 | 42
Skewness | 0.57550192 | 0.10623233
Sum | 43120 | 218012
Variance | 1921.3872 | 2403.398
Monotonicity | Not monotonic | Not monotonic
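
Several of these statistics are derived from others: range = maximum - minimum, IQR = Q3 - Q1, variance = standard deviation squared, and CV = standard deviation / mean. The sketch below illustrates those relationships using only Python's standard library. The function name `summarize` and the toy data are hypothetical, and the sample (n - 1) standard deviation and linear-interpolation quartiles are assumptions rather than a statement of how the report was produced.

```python
import statistics

def summarize(values):
    """Return a few derived summary statistics for a list of numbers."""
    # "inclusive" quartiles interpolate linearly between data points (assumed convention).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1), an assumption
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": std ** 2,
        "cv": std / mean,
    }

# Hypothetical toy data, not taken from the report.
stats = summarize([1, 2, 3, 4, 5])
print(stats)
```

The reported line_width figures for the Original Dataset follow the same relationships: 391 - 112 = 279, 260 - 194 = 66, and 43.833631 / 229.3617 is approximately 0.1911.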
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
191 5
 
2.7%
194 4
 
2.1%
183 4
 
2.1%
193 4
 
2.1%
224 4
 
2.1%
203 4
 
2.1%
232 4
 
2.1%
204 4
 
2.1%
207 4
 
2.1%
185 4
 
2.1%
Other values (90) 147
78.2%
ValueCountFrequency (%)
305 17
 
2.0%
303 17
 
2.0%
288 16
 
1.9%
321 14
 
1.6%
232 13
 
1.5%
225 13
 
1.5%
259 12
 
1.4%
254 11
 
1.3%
185 11
 
1.3%
207 11
 
1.3%
Other values (150) 727
84.3%
ValueCountFrequency (%)
112 1
 
0.5%
123 1
 
0.5%
142 1
 
0.5%
163 1
 
0.5%
167 1
 
0.5%
176 1
 
0.5%
177 1
 
0.5%
178 1
 
0.5%
179 3
1.6%
180 2
1.1%
ValueCountFrequency (%)
112 2
 
0.2%
123 2
 
0.2%
142 2
 
0.2%
163 2
 
0.2%
167 3
0.3%
176 3
0.3%
177 3
0.3%
178 3
0.3%
179 6
0.7%
180 6
0.7%
ValueCountFrequency (%)
112 2
 
1.1%
123 2
 
1.1%
142 2
 
1.1%
163 2
 
1.1%
167 3
1.6%
176 3
1.6%
177 3
1.6%
178 3
1.6%
179 6
3.2%
180 6
3.2%
ValueCountFrequency (%)
112 1
 
0.1%
123 1
 
0.1%
142 1
 
0.1%
163 1
 
0.1%
167 1
 
0.1%
176 1
 
0.1%
177 1
 
0.1%
178 1
 
0.1%
179 3
0.3%
180 2
0.2%

overspray
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 119 | 269
Distinct (%) | 63.3% | 31.2%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 104.83511 | 140.2181
Minimum | 0 | 0
Maximum | 415 | 419
Zeros | 8 | 21
Zeros (%) | 4.3% | 2.4%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 0 | 0
5-th percentile | 1 | 3
Q1 | 16 | 32
Median | 59 | 101
Q3 | 169 | 242
95-th percentile | 341.6 | 367.8
Maximum | 415 | 419
Range | 415 | 419
Interquartile range (IQR) | 153 | 210

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 110.08344 | 121.46756
Coefficient of variation (CV) | 1.0500627 | 0.86627591
Kurtosis | 0.28180928 | -0.92830923
Mean | 104.83511 | 140.2181
Median Absolute Deviation (MAD) | 49 | 88.5
Skewness | 1.1385965 | 0.59244514
Sum | 19709 | 120868
Variance | 12118.363 | 14754.368
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 8
 
4.3%
10 5
 
2.7%
91 5
 
2.7%
3 5
 
2.7%
7 4
 
2.1%
47 4
 
2.1%
32 4
 
2.1%
220 4
 
2.1%
24 3
 
1.6%
5 3
 
1.6%
Other values (109) 143
76.1%
ValueCountFrequency (%)
0 21
 
2.4%
10 17
 
2.0%
3 14
 
1.6%
47 14
 
1.6%
91 13
 
1.5%
32 12
 
1.4%
220 12
 
1.4%
201 11
 
1.3%
28 10
 
1.2%
7 10
 
1.2%
Other values (259) 728
84.5%
ValueCountFrequency (%)
0 8
4.3%
1 3
 
1.6%
2 3
 
1.6%
3 5
2.7%
4 1
 
0.5%
5 3
 
1.6%
6 1
 
0.5%
7 4
2.1%
8 2
 
1.1%
9 1
 
0.5%
ValueCountFrequency (%)
0 21
2.4%
1 6
 
0.7%
2 9
1.0%
3 14
1.6%
4 3
 
0.3%
5 7
 
0.8%
6 4
 
0.5%
7 10
1.2%
8 5
 
0.6%
9 3
 
0.3%
ValueCountFrequency (%)
0 21
11.2%
1 6
 
3.2%
2 9
4.8%
3 14
7.4%
4 3
 
1.6%
5 7
 
3.7%
6 4
 
2.1%
7 10
5.3%
8 5
 
2.7%
9 3
 
1.6%
ValueCountFrequency (%)
0 8
0.9%
1 3
 
0.3%
2 3
 
0.3%
3 5
0.6%
4 1
 
0.1%
5 3
 
0.3%
6 1
 
0.1%
7 4
0.5%
8 2
 
0.2%
9 1
 
0.1%

roughness
Real number (ℝ)

Statistic | Original Dataset | Oversampled Dataset
Distinct | 94 | 133
Distinct (%) | 50.0% | 15.4%
Missing | 0 | 0
Missing (%) | 0.0% | 0.0%
Infinite | 0 | 0
Infinite (%) | 0.0% | 0.0%
Mean | 98.037234 | 112.64037
Minimum | 43 | 43
Maximum | 192 | 228
Zeros | 0 | 0
Zeros (%) | 0.0% | 0.0%
Negative | 0 | 0
Negative (%) | 0.0% | 0.0%
Memory size | 1.6 KiB | 13.5 KiB

Quantile statistics

Statistic | Original Dataset | Oversampled Dataset
Minimum | 43 | 43
5-th percentile | 58.35 | 63
Q1 | 75 | 82
Median | 91 | 111
Q3 | 117.25 | 143
95-th percentile | 152.65 | 163
Maximum | 192 | 228
Range | 149 | 185
Interquartile range (IQR) | 42.25 | 61

Descriptive statistics

Statistic | Original Dataset | Oversampled Dataset
Standard deviation | 30.766043 | 34.881524
Coefficient of variation (CV) | 0.31381998 | 0.3096716
Kurtosis | 0.11304357 | -0.72872629
Mean | 98.037234 | 112.64037
Median Absolute Deviation (MAD) | 19 | 31
Skewness | 0.73342434 | 0.17916157
Sum | 18431 | 97096
Variance | 946.54941 | 1216.7207
Monotonicity | Not monotonic | Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
77 8
 
4.3%
68 8
 
4.3%
73 6
 
3.2%
99 6
 
3.2%
84 6
 
3.2%
75 5
 
2.7%
85 5
 
2.7%
108 5
 
2.7%
72 5
 
2.7%
117 4
 
2.1%
Other values (84) 130
69.1%
ValueCountFrequency (%)
77 23
 
2.7%
99 22
 
2.6%
73 19
 
2.2%
147 18
 
2.1%
68 18
 
2.1%
84 17
 
2.0%
145 17
 
2.0%
117 16
 
1.9%
85 16
 
1.9%
153 15
 
1.7%
Other values (123) 681
79.0%
ValueCountFrequency (%)
43 1
0.5%
44 1
0.5%
45 1
0.5%
48 2
1.1%
49 2
1.1%
54 1
0.5%
57 1
0.5%
58 1
0.5%
59 1
0.5%
60 1
0.5%
ValueCountFrequency (%)
43 1
 
0.1%
44 2
 
0.2%
45 4
0.5%
47 1
 
0.1%
48 5
0.6%
49 7
0.8%
51 1
 
0.1%
53 1
 
0.1%
54 3
0.3%
55 1
 
0.1%
ValueCountFrequency (%)
43 1
 
0.5%
44 2
 
1.1%
45 4
2.1%
47 1
 
0.5%
48 5
2.7%
49 7
3.7%
51 1
 
0.5%
53 1
 
0.5%
54 3
1.6%
55 1
 
0.5%
ValueCountFrequency (%)
43 1
0.1%
44 1
0.1%
45 1
0.1%
48 2
0.2%
49 2
0.2%
54 1
0.1%
57 1
0.1%
58 1
0.1%
59 1
0.1%
60 1
0.1%

Interactions

[Figures: 49 pairs of interaction plots, each pair showing the Original Dataset and the Oversampled Dataset. Only the image metadata (Matplotlib v3.4.3 SVG) survives in the text; the axis variables are not recoverable.]

Correlations

Variable | nozzle_voltage | drop_spacing | time | velocity | line_width | overspray | roughness | print_height | distance | ink_visco_cp | ink_visco_pas | surface_tension_dyne_cm | surface_tension_n_m | ink _density | z_number
nozzle_voltage | 1.000 | -0.020 | 0.108 | 0.968 | 0.258 | 0.048 | 0.099 | 0.137 | 0.562 | 0.084 | 0.084 | 0.084 | 0.084 | 0.084 | 0.084
drop_spacing | -0.020 | 1.000 | 0.067 | -0.041 | -0.758 | -0.276 | -0.508 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
time | 0.108 | 0.067 | 1.000 | 0.023 | -0.042 | -0.067 | -0.122 | 0.439 | 0.687 | 0.275 | 0.275 | 0.275 | 0.275 | 0.275 | 0.275
velocity | 0.968 | -0.041 | 0.023 | 1.000 | 0.300 | 0.062 | 0.136 | 0.272 | 0.482 | 0.278 | 0.278 | 0.278 | 0.278 | 0.278 | 0.278
line_width | 0.258 | -0.758 | -0.042 | 0.300 | 1.000 | 0.290 | 0.619 | 0.035 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
overspray | 0.048 | -0.276 | -0.067 | 0.062 | 0.290 | 1.000 | 0.229 | 0.205 | 0.000 | 0.198 | 0.198 | 0.198 | 0.198 | 0.198 | 0.198
roughness | 0.099 | -0.508 | -0.122 | 0.136 | 0.619 | 0.229 | 1.000 | 0.160 | 0.202 | 0.271 | 0.271 | 0.271 | 0.271 | 0.271 | 0.271
print_height | 0.137 | 0.000 | 0.439 | 0.272 | 0.035 | 0.205 | 0.160 | 1.000 | 0.149 | 0.904 | 0.904 | 0.904 | 0.904 | 0.904 | 0.904
distance | 0.562 | 0.000 | 0.687 | 0.482 | 0.000 | 0.000 | 0.202 | 0.149 | 1.000 | 0.151 | 0.151 | 0.151 | 0.151 | 0.151 | 0.151
ink_visco_cp | 0.084 | 0.000 | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.904 | 0.151 | 1.000 | 0.986 | 0.986 | 0.986 | 0.986 | 0.986
ink_visco_pas | 0.084 | 0.000 | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.904 | 0.151 | 0.986 | 1.000 | 0.986 | 0.986 | 0.986 | 0.986
surface_tension_dyne_cm | 0.084 | 0.000 | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.904 | 0.151 | 0.986 | 0.986 | 1.000 | 0.986 | 0.986 | 0.986
surface_tension_n_m | 0.084 | 0.000 | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.904 | 0.151 | 0.986 | 0.986 | 0.986 | 1.000 | 0.986 | 0.986
ink _density | 0.084 | 0.000 | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.904 | 0.151 | 0.986 | 0.986 | 0.986 | 0.986 | 1.000 | 0.986
z_number | 0.084 | 0.000 | 0.275 | 0.278 | 0.000 | 0.198 | 0.271 | 0.904 | 0.151 | 0.986 | 0.986 | 0.986 | 0.986 | 0.986 | 1.000
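
A matrix like the one above can be scanned for strongly related pairs. The following is a minimal sketch on an excerpt of four variables from the matrix. The 0.5 cut-off and the helper name `high_pairs` are assumptions for illustration, not a documented setting of the report.

```python
# Excerpt of the correlation matrix above (each pair listed once).
corr = {
    ("nozzle_voltage", "drop_spacing"): -0.020,
    ("nozzle_voltage", "velocity"): 0.968,
    ("nozzle_voltage", "line_width"): 0.258,
    ("drop_spacing", "velocity"): -0.041,
    ("drop_spacing", "line_width"): -0.758,
    ("velocity", "line_width"): 0.300,
}

def high_pairs(matrix, threshold=0.5):
    """Return variable pairs whose absolute correlation meets the assumed threshold."""
    selected = [pair for pair, r in matrix.items() if abs(r) >= threshold]
    # Strongest relationships first, regardless of sign.
    return sorted(selected, key=lambda pair: -abs(matrix[pair]))

for a, b in high_pairs(corr):
    print(f"{a} ~ {b}: {corr[(a, b)]:+.3f}")
```

On this excerpt the sketch selects nozzle_voltage with velocity and drop_spacing with line_width, two of the pairs that the report's alerts also mention.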

Missing values

Original Dataset

A simple visualization of nullity by column.

Oversampled Dataset

A simple visualization of nullity by column.

Original Dataset

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Oversampled Dataset

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Original Dataset

print_heightnozzle_voltagedrop_spacingdistancetimevelocityink_visco_cpink_visco_passurface_tension_dyne_cmsurface_tension_n_mink _densityz_numberline_widthoversprayroughness
080025827034.07.9416.30.006330.90.030915175.03907429412164
180025927034.07.9416.30.006330.90.030915175.039074261136141
2800251030038.07.8956.30.006330.90.030915175.03907421811103
3800251130044.06.8186.30.006330.90.030915175.0390741901568
4800251230041.07.3176.30.006330.90.030915175.0390741909190
5750251330040.07.5006.90.006932.30.032316144.852027180062
6800251530038.07.8956.90.006932.30.032316144.8520271788082
7800251630043.06.9776.30.006330.90.030915175.03907418524145
8800251730043.06.9776.30.006330.90.030915175.03907421350161
980028830034.08.8246.30.006330.90.030915175.0390743238171

Oversampled Dataset

print_heightnozzle_voltagedrop_spacingdistancetimevelocityink_visco_cpink_visco_passurface_tension_dyne_cmsurface_tension_n_mink _densityz_numberline_widthoversprayroughness
0702361188567.70325914.1389366.8887430.00690432.3478390.03226816204.856876285104118
1702371193262.30741914.2559356.8934000.00687832.2836240.03229516214.84609428799115
2696371190559.19898314.2466006.8873530.00692332.3378660.03224716134.85691028499119
3703371191660.30434914.0214666.8912410.00690332.2829810.03223616114.85368328899117
4698361089062.48921214.2790976.9106230.00689032.2537410.03231616144.838554283102118
564631989394.8630619.7886476.8957730.00688032.3071030.03225316164.83989529011386
6648311090397.7124079.3885056.9394510.00689432.2649050.03230616144.84638629010284
764731991092.9863329.7954036.8758290.00690632.2964040.03229516174.83855528710684
8650291090099.5251539.0706556.9000000.00690032.3000000.03230016144.852027295105116
965430988094.9392209.4366206.9160640.00690132.3504650.03232316154.84697628611581

Original Dataset

print_heightnozzle_voltagedrop_spacingdistancetimevelocityink_visco_cpink_visco_passurface_tension_dyne_cmsurface_tension_n_mink _densityz_numberline_widthoversprayroughness
1786502817900108.08.3333336.90.006932.30.032316144.85202721228272
17965031890093.09.6774196.90.006932.30.032316144.85202732347157
18065031990093.09.6774196.90.006932.30.032316144.852027305201108
181650311090094.09.5744686.90.006932.30.032316144.85202728810785
182650311190095.09.4736846.90.006932.30.032316144.85202729024115
183650311290096.09.3750006.90.006932.30.032316144.8520272621794
184650311390096.09.3750006.90.006932.30.032316144.8520272411586
185650311490096.09.3750006.90.006932.30.032316144.8520271917787
1866503116900108.08.3333336.90.006932.30.032316144.852027188173
1876503117900107.08.4112156.90.006932.30.032316144.852027203545

Oversampled Dataset

print_heightnozzle_voltagedrop_spacingdistancetimevelocityink_visco_cpink_visco_passurface_tension_dyne_cmsurface_tension_n_mink _densityz_numberline_widthoversprayroughness
1776502816900108.08.3333336.90.006932.30.032316144.8520271918199
17965031890093.09.6774196.90.006932.30.032316144.85202732347157
18065031990093.09.6774196.90.006932.30.032316144.852027305201108
181650311090094.09.5744686.90.006932.30.032316144.85202728810785
182650311190095.09.4736846.90.006932.30.032316144.85202729024115
183650311290096.09.3750006.90.006932.30.032316144.8520272621794
184650311390096.09.3750006.90.006932.30.032316144.8520272411586
185650311490096.09.3750006.90.006932.30.032316144.8520271917787
1866503116900108.08.3333336.90.006932.30.032316144.852027188173
1876503117900107.08.4112156.90.006932.30.032316144.852027203545

Duplicate rows

Original Dataset

print_heightnozzle_voltagedrop_spacingdistancetimevelocityink_visco_cpink_visco_passurface_tension_dyne_cmsurface_tension_n_mink _densityz_numberline_widthoversprayroughness# duplicates
Dataset does not contain duplicate rows.

Oversampled Dataset

print_heightnozzle_voltagedrop_spacingdistancetimevelocityink_visco_cpink_visco_passurface_tension_dyne_cmsurface_tension_n_mink _densityz_numberline_widthoversprayroughness# duplicates
065025830041.07.3170736.90.006932.30.032316144.85202722711783
165025930042.07.1428576.90.006932.30.032316144.85202726646863
3650251130040.07.5000006.90.006932.30.032316144.8520272233913
6650251630038.07.8947376.90.006932.30.032316144.85202717616543
7650251730038.07.8947376.90.006932.30.032316144.8520271923773
8650288900108.08.3333336.90.006932.30.032316144.8520273081611313
9650289900107.08.4112156.90.006932.30.032316144.8520272794151443
126502811900105.08.5714296.90.006932.30.032316144.8520273031051473
146502813900107.08.4112156.90.006932.30.032316144.8520272932011223
156502814900107.08.4112156.90.006932.30.032316144.85202719472763
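
Grouping identical rows and counting them is one simple way to produce a table of this kind. The sketch below uses hypothetical three-column rows and a hypothetical helper `duplicate_groups`. How the report itself defines its "Duplicate rows" and "# duplicates" figures is not stated here, so this is an illustration only.

```python
from collections import Counter

# Hypothetical rows (print_height, nozzle_voltage, drop_spacing); not taken from the report.
rows = [
    (650, 25, 8),
    (650, 25, 8),
    (650, 25, 8),
    (650, 25, 9),
    (800, 25, 8),
]

def duplicate_groups(records):
    """Map each row that occurs more than once to its number of occurrences."""
    return {row: n for row, n in Counter(records).items() if n > 1}

groups = duplicate_groups(rows)
print(groups)
```

Rows must be hashable (tuples here) for `Counter` to group them.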